STATS 32 Session 6: Reproducible Research

Kenneth Tay

Oct 18, 2018

Recap of session 5

Reminder: Project proposals are due tomorrow night!

Agenda for today

Reproducible research: what & why

Reproducible research: publishing data analyses together with their data and code so that others may “reproduce” the findings.

Why reproducible research?

R scripts

R markdown

In RStudio’s words, R markdown is a document format that allows you to “weave together narrative text and code to produce elegantly formatted output.”

Made possible by the knitr package (Yihui Xie)

(Source: Vimeo)
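To make "narrative text plus code" concrete, here is a minimal sketch of an R markdown file, written out from R so that it can be run as is. The file name example.Rmd, the title, and the chunk label cars-summary are made up for illustration; cars is a data set that ships with R.

```r
# Hypothetical file name; any path ending in .Rmd would do.
rmd_path <- file.path(tempdir(), "example.Rmd")

rmd_lines <- c(
  "---",                               # YAML header: document metadata
  "title: \"A minimal example\"",
  "output: html_document",
  "---",
  "",
  "## Narrative text",                 # Markdown heading
  "",
  "Plain Markdown goes here: *italics*, **bold**, and so on.",
  "",
  "```{r cars-summary}",               # start of an R code chunk
  "summary(cars)",                     # code that is run when knitting
  "```",                               # end of the chunk
  "",
  "The data set has `r nrow(cars)` rows."  # inline R code inside a sentence
)

# Write the lines to disk so the file can be opened and knitted in RStudio.
writeLines(rmd_lines, rmd_path)
```

Knitting a file like this should give a page where the heading and sentences are formatted as text, and the chunk is shown together with the output of summary(cars).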

R markdown: output (1)

R markdown: output (2)

R markdown: output (3)

R markdown: input

R markdown: more details

Surprise: (Almost) all the class material (including slides) was created with R markdown!

Quick intro to Markdown

Markdown is a simple plain-text syntax that can be converted into a web file (i.e. HTML) with basic styling.

Has support for:

Markdown reference here.

To see what your Markdown (.md) document looks like in real time, use an online Markdown editor (e.g. dillinger.io).

Today’s dataset: 2016 US Presidential Elections

(Source: Christianity Today)









Optional material

Rmd workflow (basic)

  1. Edit .Rmd file in RStudio.
  2. Knit the document (either by hitting the “Knit” button or using a keyboard shortcut).
    • When you press “Knit”, the file is automatically saved.
    • Next, RStudio starts a fresh R session, “knits” the document there, then closes that session. No code is run in your original console, so objects that exist only in your console are not available to the document!
    • RStudio creates a .html file in the same folder as the .Rmd file.
  3. Preview output in the preview pane, or by opening the .html file.
    • If you want to make changes, go back to Step 1.
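The knit step above can also be triggered from code. The sketch below uses rmarkdown::render(), and assumes the rmarkdown package is installed and pandoc is available (RStudio bundles a copy). The file name workflow-demo.Rmd and its contents are made up for illustration.

```r
library(rmarkdown)

# Hypothetical .Rmd file, created here only so the sketch is self-contained.
rmd_path <- file.path(tempdir(), "workflow-demo.Rmd")
writeLines(c(
  "---",
  "title: \"Workflow demo\"",
  "output: html_document",
  "---",
  "",
  "```{r}",
  "x <- 1:10",
  "mean(x)",
  "```"
), rmd_path)

# Knit the document. render() returns the path of the output file, which by
# default is written to the same folder as the .Rmd file.
# envir = new.env() keeps objects created by the document (here, x) out of
# the workspace. Unlike the "Knit" button, this still runs in the current R
# session, so it is not a fully isolated run.
html_path <- render(rmd_path, envir = new.env(), quiet = TRUE)

basename(html_path)
```

Opening the file at html_path in a browser corresponds to Step 3.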

Rmd workflow (advanced)

  1. Edit .Rmd file in RStudio. As you type code in the .Rmd file, also run it in the RStudio console to check that it works.
    • If the code doesn’t work, keep editing the .Rmd file until it does.
  2. Periodically knit the document.
    • When you press “Knit”, the file is automatically saved.
    • Next, RStudio starts a fresh R session, “knits” the document there, then closes that session. No code is run in your original console, so objects that exist only in your console are not available to the document!
    • If you’ve done everything correctly, it should knit properly.
    • If it doesn’t, it’s probably because you made code changes in the RStudio console that are not reflected in the .Rmd file. Use rm(list = ls()) to empty your environment, then run the code in your .Rmd file sequentially in the RStudio console to see what went wrong.
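One possible way to run the code of an .Rmd file sequentially is to extract it into a plain R script with knitr::purl() and then source() that script. This is a sketch rather than the only approach; the file names debug-demo.Rmd and debug-demo.R, and the objects n and squares, are made up for illustration.

```r
library(knitr)

# Hypothetical .Rmd file with two chunks; the second depends on the first.
rmd_path <- file.path(tempdir(), "debug-demo.Rmd")
writeLines(c(
  "Some narrative text.",
  "",
  "```{r}",
  "n <- 5",
  "```",
  "",
  "```{r}",
  "squares <- (1:n)^2",
  "```"
), rmd_path)

# purl() keeps only the code chunks and writes them to an R script.
r_path <- purl(rmd_path,
               output = file.path(tempdir(), "debug-demo.R"),
               quiet = TRUE)

# Run the extracted code top to bottom, echoing each statement so the first
# failing line is easy to spot. Objects are created in check_env rather than
# in the workspace.
# Caveat: code run in check_env can still see objects in the workspace, so
# clear the workspace with rm(list = ls()) before running this sketch if you
# want a strict check.
check_env <- new.env()
source(r_path, local = check_env, echo = TRUE)

ls(check_env)
```

If a chunk relied on an object that was only ever created in the console, source() would stop at that statement with an "object not found" error.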

Common Rmd chunk options
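As an illustration, here is a sketch of a few widely used knitr chunk options. The selection and the values are assumptions chosen for the example, not a definitive list. Options can be set once for the whole document with knitr::opts_chunk$set(), or per chunk inside the chunk header.

```r
library(knitr)

# Document-wide defaults, typically placed in a first "setup" chunk.
opts_chunk$set(
  echo = TRUE,        # show the code in the output
  eval = TRUE,        # run the code
  message = FALSE,    # hide messages (e.g. package start-up notes)
  warning = FALSE,    # hide warnings
  fig.width = 6,      # figure width in inches
  fig.height = 4      # figure height in inches
)

# Per-chunk overrides go in the chunk header of the .Rmd file, for example
# (load-data is a made-up chunk label):
#   {r load-data, include = FALSE}   run the code, show neither code nor output
#   {r, echo = FALSE}                hide the code, keep the results
#   {r, eval = FALSE}                show the code, do not run it

# Check the current default for one option.
opts_chunk$get("echo")
```

Options given in a chunk header take precedence over the document-wide defaults for that chunk only.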